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STABLE MARKED POINT PROCESSES 
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In many contexts such as queuing theory, spatial statistics, geo- 
statistics and meteorology, data are observed at irregular spatial posi- 
tions. One model of this situation involves considering the observation 
points as generated by a Poisson process. Under this assumption, we 
study the limit behavior of the partial sums of the marked point pro- 
cess {(ti, X(U))}, where X(t) is a stationary random field and the 
points ti are generated from an independent Poisson random mea- 
sure Noni . We define the sample mean and sample variance statis- 
tics and determine their joint asymptotic behavior in a heavy-tailed 
setting, thus extending some finite variance results of Karr [Adv. in 
Appl. Probab. 18 (1986) 406-422]. New results on subsampling in the 
context of a marked point process are also presented, with the appli- 
cation of forming a confidence interval for the unknown mean under 
an unknown degree of heavy tails. 

1. Introduction. Random field data arise in diverse areas such as spatial 
statistics, geostatistics and meteorology, to name but a few. It often happens 
that the observation locations of the data are irregularly spaced, this being a 
serious deviation from the typical formulation of random field theory, where 
data are located at lattice points. One effective way of modeling the observa- 
tion points is by means of a Poisson process. In [3, 4], the statistical problem 
of mean estimation is addressed given a marked point process structure of 
the data where the observation locations are governed by a Poisson random 
measure N assumed to be independent of the distribution of the stationary 
random field itself. Karr [4] obtained central limit theorem results for the 
sample mean in this context, under a finite second moment assumption, and 
showed that the limiting variance depends on the integrated autocovariance 
function. The paper at hand is a first analysis of infinite-variance marked 
point processes. 
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Within the literature on dependent, heavy-tailed stationary time series, 
the discrete-time stochastic process 



has been studied in [9]. In this work, t is the integer index of the discrete-time 
process, M is an a-stable random measure with Lebesgue control measure 
and tp is a sufficiently regular real- valued function. This is an excellent model 
for the types of data discussed above because X(t) can be defined for any 
t in R rf when ip and M are also extended to M d . A marked point process 
can be defined using model (1); data from a marked point process are of the 
form {ti, X(ti)} for i = 1,2, ... , where the points ti,tz,... are generated by a 
Poisson random measure. We focus on the non-Gaussian case where a < 2 so 
that the variance of X is infinite. Nevertheless, we study the sample variance 
statistic since it forms a suitable studentization of the mean; see [5]. The 
sample variance (as well as its square root, the sample standard deviation) 
is always well defined, even though the true variance may be infinite. 

The second section of this paper develops some theory on continuous-time 
stable processes and the convergence of integrated partial sums of the data 
is established. This is relevant to our main discussion since a continuously 
observed process must be defined before we can conceive of a marked point 
process, the random field having to be well defined at every observation 
point t. Since the limit results for continuous-time stable processes are new 
and helpful for our marked point process problem, they are also included. 

In the third section, we describe the marked point process situation and 
derive the joint asymptotics for the sample mean and sample variance. As 
expected, the limit is stochastic, but its randomness only comes from the 
stable random measure M, not from the Poisson random measure N. It will 
be seen that our results generalize the limit theory of Karr [4] to the case of 
infinite variance. 

Finally, it is well known that subsampling is applicable in the context of a 
marked point process under some conditions; see Chapter 6 of [8], hereafter 
denoted PRW [8]. Our limit theorems permit us to verify the subsampling 
requirements, so a valid confidence interval for the mean is constructed in 
Section 4. Two methods are presented, one based on the asymptotics of the 
sample mean when a is known and one based on the asymptotics of the self- 
normalized sample mean when a is not known. These methods are tested 
and compared by means of a simulation study in Section 5. All technical 
proofs are in Appendix A. 

2. Continuous parameter processes. In this section, we develop a limit 
theory for the sample mean and sample variance of an a-stable continuous- 
time random field {X(t),t E M d }. Consider an a-stable random measure M 
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with skewness intensity /?(•) and Lebesgue control measure (denoted by A) 
denned on the space M. d . The random measure is independently scattered 
and for Lebesgue-measurable sets A, the distribution of M(^4) is a-stable 
with scale X(A) 1 ^ a , skewness f A P(x)X(dx)/X(A) and location 0; see [11], 
page 118 for details. This choice of control measure reflects our desire that 
the process be strictly stationary; translation invariance is a necessary con- 
dition for stationarity in such models. In addition, it is necessary for j3 to 
have period "zero" for stationarity, that is, the skewness intensity (3{x) is 
constant as a function of x. We will denote this constant by (3. Let ip be a 
filter function in L5 := {/ : \\f\\g := J Rd \f(x)\ s X(dx) < 00} that is continuous 
and bounded for almost every x with respect to Lebesgue measure. Then we 
may construct the following stochastic integral with respect to an a-stable 
random measure M (see [11]): 

(2) X{t)= [ ip{x + t)M(dx), 

where t € M. d . We stipulate that the number 5 is in (0,a] n [0,1] (later, 
we require ip to be integrable); a will be fixed throughout the discussion. 
Note that a = 2 corresponds to a Gaussian stochastic process and has been 
extensively studied. It is a known fact that the random variable X(t) is 
a-stable with scale parameter 

<ty = mx + t)\ a X(dx)^j " = (J \t/,(x)\ a \(dx)\ a 
and skewness parameter 

S^(x + t)^f3(x)X{dx) J R ^(x)^X(dx) 
Pi> = ~z = p — a . 

where = sign(6)|6| a . The location parameter is zero unless a = 1, in 
which case it is 



2 f 

H& = — / ip(x + t)j3(x)log\tp(x + t)\X(dx) 
2(3 f 

= / ip(x) log \tp(x)\X(dx). 

7T JRd 

Intuitively, we may think of X as the convolution of ip and M, in anal- 
ogy with the infinite order moving average of classical time series analy- 
sis. Essentially, one runs the independent-increments a-stable measure M 
through the linear filter ijj and the resulting time series is strictly station- 
ary with nontrivial dependence; two random variables X(t) and X(t + k) 
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are independent if and only if the lag k exceeds the diameter of ip's sup- 
port. Of course, ip need not be compactly supported, in which case all of 
the variables are dependent. Hence, this construction makes for an inter- 
esting and relevant linear heavy-tailed model. As shown in Proposition 1 
of [6], the model defined by (1) is well defined and stationary. That is, 
for each t, the random variable X(t) is a-stable with location zero (unless 
a = 1, in which case the location is /i^>), constant skewness and constant 
scale. 

We will be interested in the asymptotic distribution of the partial sums. In 
order to consider fairly general, nonrectangular regions, we let K be a "pro- 
totype region," that is, a Lebesgue-measurable set in M, d with measurable 
boundary dK such that X(dK) = 0; see [7] for background on this concept. 
Then let K n = n ■ K, which essentially scales the prototype region by the 
integer n. Our statistics are computed over K n and our asymptotic results 
are achieved as n — > oo . This device allows us to consider nonrectangular re- 
gions and, at the same time, is a realistic construction. The appropriate rate 
of convergence for the partial sums will then be n a , as shown in Theorem 1 
below. We wish to average X{t) over all points in K n , so to that end we 
must calculate 

(3) / X(t)\(dt). 

Now a discrete sum of the random field is certainly well defined by the 
linearity of the a-stable random integral, but it is not a priori clear that (3) 
makes sense. Thus, we introduce the following definition. 

Definition 1. By expression (3), we mean the limit in probability as 
m — > oo of 

(4) x (u)&u, 

where K™ is a mesh of m points ti in K n , Ati is the dX volume of the 
elements of the mesh and the mesh gets progressively finer as m is increased 
(this is the usual 'Riemann sums' construction). 

Let us establish that this definition makes sense. By using the linearity 
of the stable integral, (4) becomes 

(5) / V tp(ti + x)AtM(dx)= [ F m (x)M(dx), 

where F m (x) = J2ti£K m ^{pi + x)Ati. Now, it follows that the limit in prob- 
ability as m — > oo of (5) is J Rd F(x)M(dx) for F(x) := f Kn ?p(t + x)\(dt), 
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provided 

\F m (x)-F{x)\ a X(dx)^0 



as m — > oo. Since the integrands are bounded in L , we may apply the 
Lebesgue dominated convergence theorem and obtain our result since F m (x) - 
F(x) pointwise. 

Thus, we have established that expression (3) makes sense and also that 
it is equal to 



(6) / F(x)M(dx)= / i/>(t + x)\(dt)M(dx). 

jR d JR d JK n 

In a similar fashion, one can define the integral of the second moment, 



(7) / X\t)X{dt), 

JK n 

as a limit in probability of a Riemann sum of X 2 (t); unfortunately, due 
to squaring, a nice representation such as (6) is not possible. However, the 
Laplace transform of (7) is closely related to the Fourier transform of 

(8) / / i/)(t + x)M(dt)M(dx), 

jR d JK n 

where B is an independent Gaussian random measure. This is shown in the 
proof of Theorem 1. Note that, conditional on B, (8) is an a-stable random 
variable with scale 

a \ 1/a 

X(dx) 



ip(x + t)B(dt) 

Now, we are interested in the continuous versions of the sample mean and 
sample variance; it is sufficient to examine the asymptotics of 

(rri f X(t)X(dt),n-% [ X 2 (t)X(dt) 

V JK n JKn 

which we explore through the joint Fourier-Laplace transform. For two ran- 
dom variables A and B > 0, this is defined by 

cf) AtB (9, 7 )=Eexp{ieA-jB}, 6»eR,7>0. 

Like the joint characteristic function, the pointwise convergence of (pA n ,B n to 
a function which is continuous at (0, 0) establishes joint weak convergence 
of A n and B n ; see [2]. The next theorem gives a complete answer to our 
inquiry. 
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Theorem 1. Consider a random field defined by the model given by (1), 
where < a < 2. Then the sample first and second moments jointly have 
limit 

(9) fn"a / X(t)\(dt),n-% [ X 2 (t)X(dt)) ^(S^a), U^a)) 

as n — > oo. Here, U^a) is nondegenerate only if a < 2. 5oo(a) is an a- 
stable random variable with scale parameter |$|, where $ = J Rd ij)(x)X(dx), 
and skewness parameter (3 ■ sign(<I>). If a 7^ 1, the location parameter is zero; 
otherwise, it is — log |<J>| . When either or a = 1 and /3 = 0, we may 

write 800(a) = X(K) l l a Q ■ M.(B) for the marginal distribution. For a < 2, 
[/00(a) is an a/2-stable random variable with scale parameter 

2[X(K)E\G\ a ] 2/a / -0 2 (2OA(dx)(cos(7ra/4)) 2/a , 



skewness 1 and location 0, where G is a standard normal random variable. 
If a = 2, then Uoo(a) is a point mass at the second moment of X, which is 
2 J K d ip 2 (s) X(ds) . We may write the marginal distribution as 

[/00(a) = 2[A(A-)E|GT] 2/a / ip 2 (x)X(dx)(cos{ira/4)) 2/a • e(a), 



where e(a) is an a/2-stable subordinator. The limiting joint Fourier- 
Laplace transform Eexp{i6>S' 00 (a) — 7^00(0;)} for a < 2 is [letting $2 = 

fw* ^ 2 {x)X(dx)} 



iX(K)^-l a=1 E \\0<f> + y/2j$ 2 G) log 6$ + y/¥f$ 2 G 



for 6 real and 7 > 0. 



Remark 1. One can develop confidence intervals via this result. How- 
ever, due to considerations of space, we will give the details only in the 
marked point case, the continuous case being much simpler. 



3. Marked point processes. We will now consider the more intricate sit- 
uation wherein the observation locations of the random field are themselves 
random. It often happens in statistical problems that random field data is 
not observed at lattice points, but instead at points scattered around the 
observation region with no discernible pattern. Frequently, we can model 
this situation through the employment of a random measure for the point 
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locations. Generally, this probabilistic structure is referred to as a marked 
point process N: 



N = J2 € (T l ,X(T l )) 



for e x {A) = 1 if x £ A and equals otherwise. See [4] for a treatment of 
marked point processes for L2 random fields. If we wish to impose the con- 
dition that the distribution of points does not depend on the location of the 
observation region, only on its size and shape, then we say that the random 
measure is spatially homogeneous — this is similar to a stationarity assump- 
tion. Also, it is often sensible to assume that the distribution of points in 
one observation region is independent of the distribution of points in an- 
other disjoint observation region — the "independent scattering" property. 
It turns out that a homogeneous Poisson random measure (PRM) satisfies 
these properties, and is therefore a reasonable model in many scenarios. 
So, let N denote a PRM with mean measure A. N is sometimes denoted 
PRM (A), as explained in the following: 

Definition 2. We say that N is PRM (A) on the measure space {R d ,B, A} 
(where B consists of the Borel sets in W 1 ) if and only if it is an independently 
scattered, countably additive random measure that satisfies 



for every A £ B := {B C B : A(A) < 00} — in other words, N(A) ~ Vois(A(A)). 
We call A the mean measure of N. 

Remark 2. More generally, we could just define the mean measure to 
be A(-) := EN(-). Now, if we impose the condition that A be a translation- 
invariant measure on M. d , then spatial homogeneity follows immediately 
from (11). Of course, A must then be Lebesgue measure (denoted by A, 
as in the previous section) modulo some constant positive multiplicative 
factor, that is, A = r\ for some r £ M + . 

We are interested in investigating the limit behavior of the sample mean 
and sample variance over the observation region K n (we preserve the nota- 
tion from the previous section) , where the data locations are now determined 
by the random measure N, which is independent of the stochastic process (1). 
Thus, we wish to study the joint convergence of 



(11) 



P[N( J A) = k] = exp{-A(^4)} 



A(,4) fc 
~~ kT~ 



k = 0,l,... 



(12) 




as n — > 00. Note that N(K n ) is the actual observed sample size. 
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Theorem 2. Consider a continuous-parameter random field generated 
from the model given by (1), where < a < 2 and the skewness intensity [5 is 
constant. Suppose that a PRM N with mean measure A = rX, independent of 
the stochastic process, governs the distribution of observation locations. If the 
observation region is the set K n and a < 2, then the normalized sample mean 
and sample variance computed over the observation region jointly converge 
in distribution to an a-stable random variable S'oo(a) and a positive a/2- 
stable random variable U^a) as n^oo: 

(N{K n )-^ [ X(t)N{dt),N(K n )-a [ X 2 (t)n(dt) 

(13) V JK n JK n 

The joint Fourier-Laplace transform of the limit (6 real and 7 > 0) is given 
by 

Eexp{i9S 00 (a)--/U 00 (a)} 

Joe (0,7) 



exp 



with the parameters given by 

a oo (e, 1 ) = (E\g(0, 1 T) 1 / a , 

/3 OO (0, 7 )=E(^,7)) <Q> , 
—28 

/M#,7) = — E[s(0,7)log5(0,7)], 

7T 



5(0,7) =0/ V(s)N(ds) + V27W / ^ 2 (s)N(ds)G 



forG a standard normal random variable independent o/N. Hence, S^a) is 
a-stable with scale cr^ (1,0), skewness P ^rrftfi and location l{ a=1 |/x oo (l,0), 
whereas Uoo(a) is a/2-stable with scale 



2(cos(7ra/4)E 



V> 2 (s)N(ds) 



a/2 ^2/a 
E\Z\ 



skewness one and location zero. If a = 2, the limit of the sample variance 
Uoo(a) is a point mass at 2r f^ d ^ 2 (s)X(ds). The stated limit for the sample 
mean still holds when a = 2; in this case, Soo(2) is a Gaussian random 
variable. 

As a special case, let us fix a = 2 in Theorem 2. The limiting squared 
scale 0^,(0,0) (half of the variance) of the partial sums is then 
2 



E 



ip(s)n(ds] 



( [ i>{s)r\{ds)\ + I ^ 2 {s)r\(ds) 

\JR d J JR d 



STABLE MARKED POINT PROCESSES 



9 



= r 2 / / ip(x)ij){x + t)X(dx)X(dt)+r [ ^ 2 {s)X{ds) 

= \r f T{t)X{dt) + \rr{Q>), 
2 JR d 2 

where r(t) = Cov(X(t), X(0)) = 2r J Rd ip(x)ip(x + t)X(dx) is the covariance 
(or codifference) function. Therefore, the variance of the limit r - 1 / 2 S~ 0O {2) 
is J Rd T(t)X(dt) + r(0), which agrees with equation (3.10) of [4] in the finite- 
variance case. 

Remark 3. Since the codifference function r(i) = 2||^||£ - ||V>(- + 1) - 
ip\\a is a natural generalization of the covariance function to a < 2, one may 
be tempted to conjecture that E| J Rd V'(s)N((is)| a = |(/ R d r(s)X(ds) +r(0)). 
This is, in fact, false, as evaluation on a simple ip will confirm. 

Corollary 1 follows from Theorem 2 by the continuous mapping theorem. 

Corollary 1. Under the same assumptions as Theorem 2, the self- 
normalized mean converges as n — > oo : 

J Kn X(t)N(dt) _^ S^ja) 

y/j Kn X*(tMdt) y/Unia) 

Remark 4. Note that no knowledge of a is required to compute the 
self-normalized mean. It is also interesting that the limit does not depend 
on r. The ratio is nonconstant, since a squared a-stable variable never has 
an a/2-stable distribution. 

4. Subsampling applications. This section of the paper describes how 
subsampling methods may be used for practical application of the results 
in the previous section. The idea is to use the subsampling distribution es- 
timator as an approximation of the limit distribution of our mean-centered 
statistic from a marked point process that depends on unknown parameters 
(including a); this procedure will yield approximate quantiles for its sam- 
pling distribution and thus confidence intervals for the mean can be formed. 
For more details and background on these methods, see PRW [8]. 

In order to use subsampling, it is desirable that the model satisfy certain 
mixing conditions. Strong mixing is one common condition on the depen- 
dence structure which is sufficient to ensure the validity of the subsampling 
theorems; see [10] for its introduction. Many time series models satisfy the 
assumption of strong mixing; for Gaussian processes, the summability of 
the autocovariance function implies the strong mixing property. In the case 
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where d=l, if our process (1) is symmetric, then the strong mixing con- 
dition is always satisfied, as Proposition 3 of [6] demonstrates. The mixing 
condition that is needed in general is somewhat more complicated and is 
defined in the next subsection. 

4.1. Subsampling with known index a. In this subsection, assume that 
a is known and greater than 1. Consider the location-shifted model 

(14) Z(t):=X(t)+n= [ ip(x + t)M(dx) + //. 

This is the appropriate model for stable stationary random fields with nonzero 
location. Note that since a > 1, the location parameter of X(t) is zero. For 
applications, we will suppose that our data are the observations {Z(ti) :ti £ 
K n } for some specified observation region K n and a random collection of 
~H{K n ) points ti generated by the Poisson random measure N. Our goal is 
to estimate the location parameter (i.e., the mean) [i with the sample mean 

Let az(k;h) be the mixing coefficients defined in PRW ([8], page 141). Since 
by (13) the sample mean converges, we can apply Theorem 6.3.1 of PRW [8] 
to this situation. Let K n (l — c) := {y G K n : B + y C K n } for B := cK n , 
where c = c n € (0, 1) is a sequence that tends to zero as the diameter of K n , 
denoted by 5(K n ), tends to infinity. We also require that c n 5(K n ) — > oo 
as n — > oo. Since 5(K n ) = 5{nK) = n5(K), this means that l/c n = o(n). In 
practice, since K n is not clearly defined by the data, one may take it to be the 
convex hull or rectangular hull of the observation points. Then it is simple 
to produce the scaled-down copy B of K n , once c is chosen. Let £iK„,B,y 
be the statistic p, evaluated on the set B + y for any y £ K n (l — c) and let 
Lk„,b{x) be the subsampling distribution estimator (6.8) of PRW [8]. We 
will assume (6.9) of PRW [8] as a condition on the mixing coefficients; this 
condition is easily satisfied in the special case d = 1 if the random field is 
strong mixing (which is always true for symmetric one-dimensional stable 
integrals, by Proposition 3 of [6]). Then by Theorem 6.3.1 of PRW [8] 

Lku,b{x) J(x) 

as n — ► oo, where J(x) is the cumulative distribution function ofr~ 1 ^ a S (yo (a). 

Remark 5. It follows that an asymptotically correct (1 — p)100% con- 
fidence interval for [i is given by 

[A - L x _ vj2 n{K n ) x i a -\ii - L p/2 n{K n ) l ' a -\ 

where L p is the pth quantile of Lk„,b, defined as inf{x : Lk u ,b{x) > p}- Note 
that explicit knowledge of a is necessary for this construction. 
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4.2. Subsampling with unknown index a. The method outlined above is 
often not immediately applicable because the rate of convergence T\^ n ) 
depends on a, which is typically unknown; thus, in practice, it may be 
necessary to estimate a. This can be done via a subsampling estimator of 
the rate, as discussed in [1] or PRW ([8], Chapter 8). These methods can be 
extended to the marked point process scenario in a straightforward fashion; 
details are omitted here, but may be obtained by contacting the authors. 
The single important difference from [1] is that the data-driven rate r^s+y) 
appearing in the subsampling distribution estimator must be replaced by a 
deterministic rate r^ B+y y, one can even use Lebesgue measure instead of 
the mean measure A since the use of logarithms in the estimator ensures 
that the differences in scale are irrelevant. 

Alternatively, one may be able to avoid the estimation of a by using a self- 
normalized estimate of the mean, for example, one may consider dividing 
by the sample standard deviation as in our Corollary 1. Suppose that &x n 
is an estimate of scale for X(t). Then we may form the ratio 

where r n is the appropriate rate of convergence such that the ratio has a 
nontrivial weak limit. The goal is to self-normalize so that t u will be a known 
rate, that is, a rate that does not depend on unknown model parameters. 
A leading example is to self-normalize so that an asymptotic result with 
t u — \/u holds. This is an improvement over the convergence rate of the 
(unnormalized) sample mean, where t u depends on a, which is unknown. 

We consider, generally, the scenario in which such a self-normalized con- 
vergence holds, that is, such that r u is a known rate. Corollary 1 furnishes 
an example of such a scenario; see the discussion following Remark 6 be- 
low. Now we adjust the definition of the subsampling distribution estimator 
accordingly: 

(15) L KniB (x) := \{K n (l - a))" 1 / 1 , «„, B , r «, n , dy, 

JK n (i-c) I k 2b. v Kx} 

with UK n ,B,y defined similarly to p,K n ,B,y On a practical note, it is possible 
that N(.B + y) = 0; in this case, we get a point mass in LK n ,B(x) at zero, 
but this will not affect the asymptotics. 



Theorem 3. Assume the convergences 

(VK n -/J.) C 



■l 



(16) a N(Kn) (fi Kn - fi) =^ V, 

dn{K n )OK n => W 
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for positive a n and d n such that r n = a n /d n and where W does not have posi- 
tive mass at zero. Let /i be a parameter and assume that t u has the form (6.6) 
of PRW [8]. Let c = c n S (0, 1) be such that c n — ► 0, but c n 5(K n ) — > oo. Fi- 
nally, assume the mixing condition (6.9) of PRW [8]. T/ien i/ie following 
conclusions hold: 

(i) Lk„,b{x) — ► J{x) for every continuity point x of J(x); 

(ii) if J(-) is continuous, then sup x \LK n ,B(x) — J{x)\ — > 0; 

(iii) letting 

ck„,b(1 ~t) = inf{x : L Kn>B (x) ^ 1 - t}, 
i/ J(x) is continuous at x = inf{x : J(x) ^ 1 — t}, then 

nr H K n) Kn - 0) < c Kn , B (l - 1)} - 1 - 1. 

Thus, the asymptotic coverage probability of the interval [fiK n — r N(K n ) _1 x 
cr:„,b(1 — p), oo) is i/te nominal level I — p. 

Remark 6. Since the subsampling distribution estimator involves Rie- 
mann integration, some numerical approximation must be made in its cal- 
culation; see Section 6.4 of PRW [8] for further details. 

As a specific application of Theorem 3, consider the model given by (14) 
and define the sample standard deviation by 



Hence, we have the convergence 
(17) N(K n ) 



1/2 (j*K n ~ fi) C Sooja) 



as n —> oo, which follows immediately from Corollary 1. Note that when 
a = 2, this convergence is valid, but Uoo(&) is degenerate [it is equal to the 
variance of X, which is 2r L d tp 2 (s)X(ds)]. The limiting ratio, however, is 
always absolutely continuous. Hence, we may apply Theorem 3 with r u = ^Ju 
and obtain an asymptotic 1—p confidence interval for /z, 

HK n ~ M-p/2— l=f=^^K n ~ Lfp/2- 



where p = LK n ,B{L p ). Note that no explicit knowledge of a is necessary for 
this construction; neither is it necessary to know the Poisson intensity r. 
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5. Simulation studies. We now focus on the above mean estimation sce- 
nario, with the asymptotic result (17). So, in this case t u = is a known 
rate. In this section we demonstrate the methods of Section 4 by means of 
several simulation studies. First, we illustrate how a stable marked point 
process can be simulated and then we discuss the practical implementation 
of the subsampling distribution estimators. Finally, our simulation results 
present the empirical coverage of the confidence intervals constructed via 
the subsampling methodology. 

5.1. Implementation. Following Samorodnitsky and Taqqu ([11], page 149) 
(also see Resnick, Samorodnitsky and Xue [9]), the series representation 
of (1) is 

X (t) = c x j° Yl ^7 1/a m + ^(^r 1A \ 

i>l 

provided that X(t) is symmetric a-stable. In this representation, 

• {ej} are i.i.d. Rademacher random variables, 

• {Tj} are the arrival times of a unit Poisson process (so they are sums of i 
i.i.d. unit exponentials), 

• {Ui} are i.i.d. random variables with probability density function q, 

• C a is a positive constant defined in (1.2.9) of [11], 

and all three of these sequences are independent of each other. We have 
freedom to select q, as long as it is a probability density function with 
support on the whole real line. In our simulations, we take q to be the 
Cauchy density in R d ; in practice, a heavy-tailed q has proven more effective 
in producing realistic simulations. For simulation, we adopt the following 
procedure. First, fix a and determine the observation region K n = nK, which 
can have a variety of shapes in M. d . Also, let A be Lebesgue measure for 
simplicity. If we want a sample of size n, we might choose K n such that 
A(K n ) = n, although this is no guarantee that we will obtain n data points. 

1. Simulate N(AT n ) which is Poisson with mean rate A(K n ). 

2. Simulate T±, . . . ,T^r Kn \ i.i.d. from a uniform distribution on K n . 

3. Simulate {e^}, {Tj} and {Ui} for i< I, where / is a predetermined thresh- 
old. 

4. Determine a vector v, which is the "center" of the region K n . Compute 

X(Tj) = C l J a Y, eFMUi + Tj - v)q{U i r 1/a 

i=l 



for j = l,...,N(K n ). 
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Use of the centering constant v is optional, but we found that it improves the 
quality of the simulation; theoretically, it merely introduces a deterministic 
lag and thus does not affect the distribution. One choice of v is to let each 
component be defined by the various centroids. As mentioned in [11], simu- 
lation by this method is unwieldy because convergence in I is slow. However, 
there is no alternative method for correlated stable random fields. By trial 
and error, we found that / = 100 gave a decent trade-off between simulation 
quality and speed; increasing I to 1000 gave little visible improvement to 
the simulation, while greatly retarding the speed. 

Next, we need to compute the subsampling distribution estimators given 
by (6.8) of PRW [8] and (15). The easiest method is to approximate the 
integrals via Monte Carlo. This is achieved by drawing a large number of 
random variables (we used 10,000) uniformly distributed on K n {l — c). One 
detail is that for a given simulated y, the number N(B + y) could be zero; this 
will create division by zero, problems for 0R,B,y, so by convention the latter 
is set equal to zero in this case. In practice, this creates a point mass at zero 
in the subsampling distribution, but the effect is lessened by taking larger c 
values. Another practical problem occurs when N(B + y) = 1, which creates 
a division by zero issue for the self-normalized subsampling distribution 
estimator. In this case, we should have 



where the sign depends on whether or not the data value at the single 
point t* exceeds the mean. In our computer code, we replace <JK n ,B,y by a 
very small value in this case, so the resulting ratio will be a large positive or 
negative value, corresponding to whether X(t*) is greater than or less than 
the mean. 

5.2. Results. Our simulation study focuses on dimension d = 2, with 
i^ioo given by both a square region (10 by 10) and a rectangular region 
(5 by 20). Using Lebesgue mean measure, this gives us samples of aver- 
age size 100. The centering vector v is simply given by the midpoints (5, 5) 
and (2.5, 10) for the square and rectangular regions, respectively. We sim- 
ulated stable processes for a values 1.1, 1.2, . . . , 1.9 and a "Gaussian" filter 
function ij)(x\,X2) — exp{— (x\ + X2)/2} and investigated the block ratios 
c = 0.1, 0.2, 0.3, 0.4. 

Method 1 assumes that a is known (see Section 4.1), whereas Method 2 
uses a self-normalized subsampling distribution, as in Section 4.2. Each sim- 
ulation was performed 1000 times. Tables 1-6 record the proportion of simu- 
lations for which the constructed confidence interval contained the true mean 
of zero (the standard errors are approximately 0.0095, 0.0069 and 0.0031 for 
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Table 1 

Coverage at nominal level 0.90, square region 



Method 1 Method 2 





c 


0.1 


0.2 


0.3 


0.4 


0.1 


0.2 


0.3 


0.4 


a 


= 1.1 


0.926 


0.961 


0.954 


0.869 


0.983 


0.909 


0.699 


0.577 


a 


= 1.2 


0.904 


0.954 


0.918 


0.857 


0.981 


0.933 


0.719 


0.591 


a 


= 1.3 


0.807 


0.928 


0.909 


0.810 


0.991 


0.942 


0.752 


0.638 


a 


= 1.4 


0.726 


0.911 


0.880 


0.821 


0.987 


0.961 


0.770 


0.637 


a 


= 1.5 


0.622 


0.854 


0.863 


0.806 


0.995 


0.954 


0.780 


0.687 


a 


= 1.6 


0.547 


0.777 


0.834 


0.777 


0.998 


0.965 


0.783 


0.721 


a 


= 1.7 


0.440 


0.761 


0.802 


0.740 


0.997 


0.974 


0.798 


0.716 


a 


= 1.8 


0.414 


0.712 


0.746 


0.723 


0.997 


0.984 


0.810 


0.706 


a 


= 1.9 


0.366 


0.670 


0.739 


0.691 


0.999 


0.986 


0.839 


0.727 



a = 0.1, 0.05 and 0.01 respectively). Clearly, both methods are sensitive to 
the choice of c. For Method 2, the coverage is a decreasing function of c 
and an increasing function of a, whereas for Method 1, the coverage is not 
monotonic in c and is decreasing in a. For most cases, an optimal value of c 
for Method 2 would seem to lie between 0.2 and 0.3, based on the observed 
pattern that c = 0.2 resulted in overcoverage and c = 0.3 in under cover age. 
In contrast, it seems that for high values of a, no value of c could be found 
to provide good coverage for Method 1. Neither method was particularly 
sensitive to the shape of the sampling region since results for the square 
and the rectangular region were similar. Although Method 1 had superior 
coverage for low a, the overall performance of Method 2 was superior. In 
general, the choice of c will depend on how important undercoverage and 
overcoverage are for a particular problem; c is also sensitive to the shape 
of K n . Finally, the coverage did improve with larger sample size, but for 
reasons of brevity, those results are not displayed here. 

Method 2's superior performance in simulation is interesting, since it also 
uses less information than Method 1 (it does not assume that a is known). 
This seems to corroborate the assertion that a data-driven normalization 
(such as the standard deviation) is superior in finite-sample to one based 
purely on a rate of convergence. If an extreme does not occur in the ob- 
served data, then normalization via a rate will overcompensate; conversely, 
if an unusual number of extremes (or an unusually large extreme) occurs, 
then the rate undercompensates. Use of the standard deviation instead will 
automatically adjust in an appropriate fashion, since it will be smaller in 
the first scenario and larger in the second. 
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Table 2 

Coverage at nominal level 0.95, square region 



Method 1 Method 2 





c 


0.1 


0.2 


0.3 


0.4 


0.1 


0.2 


0.3 


0.4 


a 


= 1.1 


0.987 


0.989 


0.987 


0.934 


0.999 


0.984 


0.810 


0.676 


a 


= 1.2 


0.987 


0.992 


0.959 


0.881 


0.996 


0.984 


0.829 


0.663 


a 


= 1.3 


0.943 


0.983 


0.999 


0.992 


1.000 


0.991 


0.845 


0.721 


a 


= 1.4 


0.925 


0.964 


0.938 


0.886 


0.998 


0.990 


0.857 


0.701 


a 


= 1.5 


0.844 


0.941 


0.929 


0.872 


0.998 


0.991 


0.856 


0.749 


a 


= 1.6 


0.763 


0.871 


0.894 


0.850 


1.000 


0.993 


0.873 


0.785 


a 


= 1.7 


0.661 


0.860 


0.880 


0.811 


1.000 


0.995 


0.874 


0.785 


a 


= 1.8 


0.616 


0.808 


0.820 


0.794 


1.000 


0.998 


0.890 


0.762 


a 


= 1.9 


0.519 


0.776 


0.801 


0.759 


1.000 


0.997 


0.901 


0.784 



Table 3 

Coverage at nominal level 0.99, square region 



Method 1 Method 2 





c 


0.1 


0.2 


0.3 


0.4 


0.1 


0.2 


0.3 


0.4 


Q 


= 1.1 


0.999 


0.999 


0.999 


0.981 


1.000 


1.000 


0.926 


0.784 


a 


= 1.2 


1.000 


0.999 


0.987 


0.945 


1.000 


0.999 


0.941 


0.763 


a 


= 1.3 


0.995 


0.995 


0.999 


0.997 


1.000 


1.000 


0.946 


0.806 


a 


= 1.4 


0.987 


0.993 


0.985 


0.945 


1.000 


1.000 


0.948 


0.802 


a 


= 1.5 


0.964 


0.978 


0.966 


0.925 


1.000 


1.000 


0.952 


0.831 


a 


= 1.6 


0.930 


0.943 


0.949 


0.911 


1.000 


1.000 


0.955 


0.851 


a 


= 1.7 


0.862 


0.927 


0.922 


0.876 


1.000 


1.000 


0.949 


0.863 


a 


= 1.8 


0.838 


0.895 


0.885 


0.863 


1.000 


1.000 


0.957 


0.834 


a 


= 1.9 


0.734 


0.873 


0.873 


0.827 


1.000 


1.000 


0.976 


0.845 



Table 4 

Coverage at nominal level 0.90, rectangular region 



Method 1 Method 2 





c 


0.1 


0.2 


0.3 


0.4 


0.1 


0.2 


0.3 


0.4 


Q 


= 1.1 


0.948 


0.965 


0.926 


0.874 


0.983 


0.861 


0.582 


0.492 


Q 


= 1.2 


0.893 


0.945 


0.896 


0.853 


0.987 


0.848 


0.582 


0.507 


a 


= 1.3 


0.835 


0.925 


0.883 


0.838 


0.992 


0.894 


0.637 


0.548 


a 


= 1.4 


0.737 


0.879 


0.848 


0.784 


0.992 


0.887 


0.632 


0.558 


a 


= 1.5 


0.631 


0.840 


0.804 


0.764 


0.997 


0.921 


0.675 


0.560 


a 


= 1.6 


0.571 


0.784 


0.788 


0.759 


0.996 


0.922 


0.687 


0.602 


a 


= 1.7 


0.472 


0.727 


0.731 


0.693 


0.998 


0.940 


0.707 


0.601 


a 


= 1.8 


0.418 


0.681 


0.712 


0.695 


0.998 


0.951 


0.728 


0.626 


a 


= 1.9 


0.396 


0.645 


0.709 


0.672 


0.997 


0.954 


0.737 


0.651 
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Table 5 

Coverage at nominal level 0.95, rectangular region 



Method 1 Method 2 





c 


0.1 


0.2 


0.3 


0.4 


0.1 


0.2 


0.3 


0.4 


a 


= 1.1 


0.991 


0.990 


0.979 


0.932 


0.998 


0.960 


0.673 


0.558 


a 


= 1.2 


0.979 


0.978 


0.966 


0.911 


0.998 


0.948 


0.681 


0.578 


a 


= 1.3 


0.966 


0.973 


0.943 


0.901 


0.999 


0.953 


0.716 


0.599 


a 


= 1.4 


0.920 


0.946 


0.911 


0.864 


0.998 


0.967 


0.724 


0.625 


a 


= 1.5 


0.859 


0.909 


0.882 


0.833 


1.000 


0.960 


0.752 


0.625 


a 


= 1.6 


0.797 


0.884 


0.857 


0.827 


0.999 


0.976 


0.771 


0.658 


a 


= 1.7 


0.719 


0.821 


0.812 


0.782 


0.999 


0.988 


0.795 


0.666 


a 


= 1.8 


0.618 


0.799 


0.787 


0.764 


1.000 


0.988 


0.806 


0.699 


a 


= 1.9 


0.567 


0.751 


0.773 


0.745 


1.000 


0.990 


0.806 


0.696 



APPENDIX A 



Proof of Theorem 1. We will principally treat the a < 2 case, since 
when a = 2 the results are already known; see [4]. We will consider the joint 
Fourier-Laplace transform of 

(n- d / a f X{t)\(dt),n- 2d l a f X 2 {t)X(dt)). 

We first consider the case where the filter function has compact support in 
the set L = {i£ W 1 :\xi\ < Then we can write 

Eexp(i9n- d / a [ X (t) dt - in ~ 2d l a f X 2 {t)dt\ 

Table 6 

Coverage at nominal level 0.99, rectangular region 



Method 1 Method 2 





c 


0.1 


0.2 


0.3 


0.4 


0.1 


0.2 


0.3 


0.4 


a 


= 1.1 


0.999 


1.000 


0.996 


0.975 


1.000 


0.994 


0.818 


0.655 


a 


= 1.2 


0.998 


1.000 


0.994 


0.958 


1.000 


0.998 


0.820 


0.656 


a 


= 1.3 


1.000 


0.996 


0.978 


0.952 


1.000 


0.996 


0.843 


0.685 


a 


= 1.4 


0.992 


0.982 


0.968 


0.931 


1.000 


0.999 


0.852 


0.694 


a 


= 1.5 


0.963 


0.970 


0.942 


0.889 


1.000 


0.997 


0.861 


0.707 


a 


= 1.6 


0.942 


0.948 


0.924 


0.886 


1.000 


0.996 


0.885 


0.747 


a 


= 1.7 


0.889 


0.912 


0.882 


0.852 


1.000 


1.000 


0.902 


0.736 


a 


= 1.8 


0.832 


0.878 


0.870 


0.826 


1.000 


1.000 


0.894 


0.777 


a 


= 1.9 


0.794 


0.856 


0.860 


0.813 


1.000 


1.000 


0.899 


0.776 
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:Eexp(i0n- d / a / X(t)dt + i^n~ d / a f X(t)M(dt) 

I JK n JK„ 



Eexp< in 



-d/a 



ip(x + t)W{dt) )M(dx 



where Wt is a Brownian motion with drift 9 and volatility and B is a 
Gaussian random measure independent of M. Conditional on this Brownian 
motion, the scale parameter is 

l/a 

ip(x + t)W{dt) 



CFr, 



1 

n d jRd 



the skewness is j3/3 n /<j%, with 



1 



and the location l{ a= iy fj, n , with 
-2(3 1 



ip(x + t)W(dt) dx 



dx 



7r n u 



A„ 



tjj{x + t)W(di) lo, 



A„ 



i/j(x + t)W(dt) 



dx. 



We only need to determine the convergence in probability of each parameter. 
We focus on the scale, as the other two are proved in a similar fashion. Let us 
define H x = J K ip(x + t)W(dt), which is Gaussian with mean 9 f K ip(x+t) dt 
and covariance 

E[(H X - E[H x ])(H y - E[H y })] = 2 7 f tf}(x + t)^(y + t) dt. 

Hence, all moments are bounded in K n since the variance is uniformly 
bounded by 27 / Rd ip 2 (t) dt. Let B = (0, l] d and Jj = J B \H x+ j\ a dx, so that 



\H T \ a dx 



E 



B 



\H x+j \ Q dx 



tv 



where Jj = if x + j + 1 ^ L for all x G B and t G K n . Since the summands 
are bounded in probability (since, for example, the 2a moment exists), we 
claim that the above expression is op(l) + —aY^,keK n J-k- ^ Jj 7^ 0> then 
there must exist some x G B and t G K n such that j + x + 1 £ L. For any 
set A, define A(l) = {y.d^y^A) < I}, where cfco is the sup-norm metric 
on R d . Then £ jeZd Jj = Y, je -K n (l+i) J i since 
j + x + t £ L =$>■ max |Jj + Xi + ti \ <l 

i=l,...,d 

=>• max \ji + tj| < I + 1 for some £ G -?T n 

i=l,...,d 

=>■ inf max | jj + tj| < Z + 1 
teK n i=l,...,d 
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and hence, for such a j, 



doo(j,-K n ) 



inf doofa-k) 

k£K n 



inf max \ji + k%\ < I + 1, 
kaKn i=l,...,d 



which implies that j £ —K n (l + 1). Now, we partition into a disjoint union 
-K n (l + 1) = -K„U (--FT n (/ + 1) \-K„). It is simple to show that -K n (l + 
1) = —nK(l + 1/n), where K = K±, the prototype region. Hence, —K n (l + 
1) \ — X„ = n{—K{l + 1/n) \ — -fC) and the count of integer lattice points in 
this set will be asymptotic (as n — > oo) to n d X(—K(l + 1/n) \ —K), by the 
definition of Lebesgue measure. However, the sets in the sequence —Kil + 
1/n) \ — K, for any fixed I, are inscribed in one another. So by continuity 
from above, 



X(-K(l + 1/n) \ -K) - A p| -#(Z + 1/n) \ 



< \{d{-K)) = 



,n>l 



since the intersection is a subset of the boundary. This shows that asymp- 
totically, the only terms that count in Y^j Jj are those in —K n . Now, these 
J-k variables are weakly dependent since the Gaussian variables H x and H y 
are independent if minj \xi — yi\ > I. Hence, the law of large numbers holds 
and 



(7, 



k£K„ 



It is necessary to determine E[J_/%]; here, we follow the argument given in 
the proof of Theorem 2 below, which is actually more complicated. Defining 
H x = J K d ip(x + t)W(dt), one can show that the difference between Eli^-^" 



and E|iJ T 



tends to zero. Now, 



E\H t 



ip(t) dt + 




tp 2 (t)dtG 



which no longer depends on x or k (and G denotes a standard normal 
random variable). With similar results for the other parameters, the joint 
Fourier-Laplace transform satisfies 



Eexpj-cr^l -ift^Pj + Wvl>=i} j 

-+Eexp{-A(*QC(0, 7 ) (l - z/ 00 ^ 



where 



0-00(0,7) 



E 



6$ + v / 27$ 2 G 



a\ 1/a 



+ i\{K)u 00 (9,j)l {a=1} \, 
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with $ = J Rd i/}(s)\(ds) and $2 = \J Jr^ ip 2 (s)X(ds). This convergence follows 
from the convergence in probability of the parameters because the expo- 
nential function in the transform is bounded. The existence of Soo(a) and 
L^oo (a) follows from the continuity of the limiting Fourier-Laplace transform 
at (0,0). Letting 7 = 0, one recognizes the Fourier transform of an a-stable 
random variable with parameters as described in Theorem 1, which is ^(a); 
if = 0, then one obtains the Laplace transform of a positive a/2-stable ran- 
dom variable [/00(a), whose scale parameter is calculated in the statement 
of the theorem. 

We remark that it is sufficient to consider convergence of the joint Fourier- 
Laplace transform as shown by Fitzsimmons and McElroy [2]. Finally, we 
must remove the truncation of the model. This is similar to (and actually 
easier than) the argument presented in the proof of Theorem 2 and is not 
repeated here. Thus, the proof is complete. □ 

Proof of Theorem 2. Note that since N is a PRM, it follows from 
the law of large numbers, independent scattering, spatial homogeneity and 
shift invariance of A that as n tends to infinity. Thus, it suffices 

to examine the limit behavior of 

(18) fn _ « / X(t)N(dt),n-% [ X 2 {t)N(dt)) 

since A(K n ) = rn d \(K); in the end, we must multiply our results by r" 1 /" x 
\{K)~ l l a . Let us first consider a filter function ip with compact support in 
the set L = {x G M. d :\xi\ < iVi}. Then we can write 



Eexp(i6n- d/a [ X(t)N(dt) - j n - 2d/a [ X 2 (t)N(dt)\ 

= Eexp{i6n- d / a [ X(t)N(dt)+i^n~ d/a [ X(t)G(t)N(dt) 
I JK n JK n 

x in ~ d / a / (el ip(x + t)N(dt) 



Eexp< 



+ v/27 / ^(x + t)G{t)N(dt) M(dx 



by introducing a process of i.i.d. standard normal random variables {G(t)} 
that are independent of M. Conditional on N and the G(t)'s, this is an 
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a-stable random variable with scale 



1 

n d jRd 



9 if>(x + t)N(dt) + V27 / ip(x + t)G(t)N(dt) 



a \ l/a 

dx 



skewness (3f3n/a^, with 

fa = -1 [ (e [ tP(x + t)Mdt) + / ^(x + t)G(t)N(dt)] { ] dx, 
and location l{ a= n/XN, with 

m = —-^i [(of 1>(? + <)N(df) + V*y [ #z + t)G(t)N(dt)) 

vr n a JRd \ J Kn J Kn J 

x\og(e [ tp(x + t)N(dt) 

+ / ifr(x + t)G(t)N(dt)] dx. 

JK„ J 

Hence, to determine convergence, we will establish the limits in probability 
of each of these parameters and thereby determine the joint Fourier-Laplace 
transform of the sample mean and sample variance. We provide an explicit 
proof of the convergence of a^; the proofs for the other two parameters are 
similar. Let us write 



H x = 9 Mx + s)N(ds) + V27 / i>{x + s)G(s)Wds) 

JK JK 

so that <rg = n~ d J Rd \H x \ a dx. Now, {H x } is, conditional on N, a Gaussian 
process with mean 6 f K ip(x + s)N(ds) and covariance 

Cov N [H Xl H y }=K[(H x -EH x )(H y -KH y )\N]=2-f J K iP(x + s)i{j(y + s)N(ds). 
Taking a second expectation shows that 

Cov[H x ,H y }=K[(H x -EH x )(H y -EHy)]=2 1 [ ^(x + s)^(y + s) ds, 

J K 

which is zero if \xi — yi\ >l for at least one i between 1 and d. Hence, these 
variables are /-dependent (taking the oo-norm for M. d ), which will be useful 
in establishing a weak law of large numbers. It is also true that all moments 
of H x exist (even as n increases): 

\H X \<\6\ [ \ijj(x + s)\N(ds) + y/27 / |^(x + s)||G(s)|N(ds) 

JK n JK n 

= \0\ f \^(x + s)\N(ds) + y/frjJ [ ijj{x + sfn{ds)\G x \ 

JK n V JK n 

< [ \i/;(x + s)\N(ds)(\e\ + V^\G X 

JR d v 
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where G x is a dependent sequence of standard normal random variables. 
The equality in distribution follows from the stability property of Gaussian 
random variables and the fact that integration with respect to N is, condi- 
tional on N, a discrete sum. The final random variable is Poisson with mean 
/ K d |V>(s)|N(c2s) multiplied by an independent Gaussian with mean \0\ and 
variance 27; all moments therefore exist, even as n — > 00. Next, we write 



1 



1 



— / \H x \ a dx = — 7 } 

n a ,Jb 



H x+j \ a dx 



using the same notation as in the proof of Theorem 1. By the same argu- 
ments, up to terms going to zero in probability, this expression is the same 
as -^3- ^2ikaK n Ib \ H-x—k\ a dx. Because the variables = f B \H x _j i \ a dx are 
a random field in k with finite dependence, the weak law of large num- 
bers applies. Hence, erg = op(l) + sYlkeK n E[*/-fc] as n —> 00. It remains to 
compute the expectations. Let 



H x = 9 ^{x + s)N(ds) + y/¥/ / ip(x + s)G(s)n(ds) 



so that 



E\H X . 



^(y)N(dy) + y/ZyJ / ^(y)N(dy)G : 



T x—k 



where G x is a mean-zero Gaussian sequence with known correlation structure 

J Rd ^{y)^{y + h)dy 



E[G x G x+ h] 



J Rd ip 2 (y)dy 



We claim that the average E[J] n is asymptotically E|flo| • The absolute 
difference is 



(19) 



\ nj-k}--^ E E / \Hx- k \ a dx 

n k^ n n k^ n Jb 

1 _ /•„,,„ _ , ~ , 



k&K n 

< A E / n\H x -k\ a -\H x - k \ a \dx. 
We have the following inequality, stated as a separate lemma. 



Lemma 1. I/O < a < 1, then 

\\a\ a - \b\ a \ < \a-b\ a 

and if 1 < a < 2, then 

\\a\ a - \b\ a \ <\a- b\ a + 2max{|a|, |6|}|a - b\ a ' 2 
for all real numbers a and b. 
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Proof. The case a < 1 is well known. So, suppose that a > 1. If \a\ > |6|, 
then 

\ a -b\ a = (\a-b\ a ' 2 f 

> (| a |«/2 _ | 6 |«/ 2) 2 

= (|a| Q / 2 - \b\ a ' 2 ){\a\ a l 2 + \b\ a > 2 - 2\b\ a ' 2 ) 

= (\a\ a - \b\ a ) - 2\b\ a l 2 {\a\ a l 2 - \b\ a/2 ), 
where the second line follows from a < 2. This, in turn, implies that 
\a\ a - \b\ a <\a- b\ a + 2\b\ a/2 {\a\ a / 2 - \b\ a / 2 ) 

<\a- b\ a + 2max{|a| a/2 , |6| a/2 }|a - b\ a ' 2 . 
Now, the case |6| > \a\ is similar, which proves the lemma. □ 

Using this lemma, it suffices to examine the average expected integral of 

\H X — k H x —k\ 

<2 S ( 6 [ %j){x - k + s)N(ds) + y/5y [ i>{x - k + s)G(s)N(ds) 
\ Jk° Jk- 

where 5 is either a or a/2. If 5 = a/2, we have 
— H / ^\H x -k — H x -k\ ^ 



keK n 



< 2 "/ 2 i 7 y 

n d 



k£K r , 



BJKS, 



(x-k + s)\ a/2 dsdx {\9\ a ' 2 + (2 7 )°/ 4 ), 



using the fact that EN(ds) = ds. If 5 = a, using the result that 



E 



K: 



tp(x - k + s)N(ds] 



+ E 



2 7 / ij)(x - k + s)G(s)Mds) 

K c 



< \9\ a E 



k + s)\ a/2 N(ds 

+ (2 7 ) Q / 2 E( / Wx - k + s)\ a/2 \G{s)\ a/2 n{ds) 
\6\ a f \ip(x - k + s)\ a ds 

\ip(x — k + s)\ a ds 



+ (27) 



(x — k + s 



I a/2 



ds 



K' 



E\G\ a , 
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we easily obtain 



<T—, Y [ [ \i)(x-k + s)\ a dsdx(\d\ a + (2 7 ) a/2 E|Gf) 
+ 2a -dH [([ \^-k + s)\^ 2 ds) 2 dx((2 7 r/ 2 E\G\ a ). 

71 keK n jBXJlC n J 

So, in order to show that (19) tends to zero as n — > oo, it is enough to 
demonstrate that 

-dH I il Mx-k + s)\ 5 dsYdx 

tends to zero, for 5 equal to either a or a/2 and <\> equal to 1 or 2. Note that 
through a simple change of variable, this becomes 



(s — x)\ ds) dx 



1 

n d JK n \Jk° 

It is a simple but tedious analysis exercise to show that this tends to zero as 
n — ► oo, for 5 = a or a/2 and <j) equal to 1 or 2. The result of this analysis is 
that 



X(K)E 



where Z is a standard normal random variable. Note that this limiting scale 
parameter is not random. Using similar techniques, one can show that ,% 
converges in probability to 

(a) 



\(K)E[e [ ^(y)N(dy) + v^7a/ / ^ 2 (y)N(^)z) = X(K) ^(9 , ^) 



and that fin tends to 



IT 



^(y)N(dy) + v^7a / ^ 2 (y)N(dy)Z 



X log / i/>(y)N(dy) + V27i / ^ 2 (y)N(dy)Z 
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which is X(K)}l 00 (6, r y). Now, multiplying our results by r l l a X(K) our 
joint Fourier-Laplace transform is 



Eexp 

(20) 



Eexp< — r 



- 1 ^(0, 7 )(l-z^||^|)+,r- 1 /ioo(e,7)l {Q =i}}, 



by the dominated convergence theorem. The existence of S 00(a) and Uoo(a) 
now follows from the continuity of the limiting Fourier-Laplace transform 
at (0,0). Letting 7 = 0, one recognizes the Fourier transform of an a-stable 
random variable with parameters as described in Theorem 2, which is Soo(a); 
if 9 = 0, then one obtains the Laplace transform of a positive a/2-stable ran- 
dom variable Uoo(a), whose scale parameter is calculated in the statement 
of the theorem. 

Finally, we must remove the truncation of the model. Define 

X t (t)= I ip t (x + t)M(dx) = [ ip(x + t)l lD (x + t)M(dx), 
Jm d JR d 

where D = (— 1, l] d so that ID is a d-dimensional cube centered at the origin 
with width 21. Then 

n -d/a f X (t)N(dt)-n- d / a [ X m (t)N(dt) 

JK n JK n 

= n~ d/a f f %b(x + t)ln DV (x + t)N(dt)M(dx) 

must tend to zero in probability as I — > 00 for any fixed n. Taking the a power 
of the scale of the above random variable, conditional on N, we obtain 



If a < 1, this is bounded by 
1 



^b(x + t)l {lDV (x + t)n(dt) 



X(dx). 



\^(x + t)\ a l {lDr (x + t)N(dt)X(dx) 



1 

n d JK n J{iD} c -t 

my)rX(dy)^^. 



(x + t)\ a X(dx)N(dt) 



Thus, the limit superior as n — > 00 of J Rd \ J K ip(x + t)l^ D y(x + t)N(dt)\ a x 
X(dx) tends to zero as I — > 00. When a > 1, we use the bound of 
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sup 

x 



Hx + t)l {lDr {x + t)N(dt) 



IK, 



n d Jjnd 



K„ 



ip(x + t)l {lD} c(x + t)N(dt) 



a-l 



\{dx) 



1 



< / \^(t)\H(dt) ■ — / / \^{x + t)\ a - L l {lD y{x + t)n{dt)\{dx) 



Q- 



{wy 



where Q = J R d \ip(t)\N(dt) is a positive random variable that is bounded 
in n. So the limit superior in this case also tends to zero as I — ► oo. Similar 
arguments can be applied to the sum of squares and since the limiting joint 
Fourier-Laplace transform is continuous in I, we can take the limit as / — > oo 
on both sides of our joint weak convergence (20). This completes the proof. 
□ 



Proof of Theorem 3. This proof has the same structure as that of 
Theorem 6.3.1 from PRW [8]. First, we show that T^^+y) 1S almost surely 

asymptotic to Tx(B+y) = T A(B) • As in the proof of Theorem 2, A(B+yj — »1 as 
n — > oo; this uses the condition that c n 5(K n ) — ► oo. It follows from the form 
of t(u) that ^(5+") Let x be a continuity point of J(x), the cumulative 
distribution function of the limit random variable J. Then 

/ VK n ,B,y ~ fiK n \ , 
T N(B+y) 1-* < X 



(B+y) j <X + T N(B+y ) > 

V °K n ,B,y J K J \ &K n ,B,y J J 



For any t > 0, let 



R Kn ,B(t) = HK n (l - c))- 1 f 1 ( *K n -^ <f ,dy 

JK n {l-c) {TN { B + y)( &KnBy Ht} 



— A(if„(l C)) / 1 {d N( B + y)VK n ,B,y>a fH B + y)(flK„-V)/t}dy- 

JK n (l—c) 

Now, for all 5 > 0, a^ B+y ^ {fiR n — A 4 ) < 5 with probability tending to one, 

since a^B+y) / a N(K n ) — > follows from c n 5(K n ) — ► oo. So, with probability 
tending to one, 

RK n , B (t) > KK n (l - c))" 1 f l {d nB+y) a Kn , B , y >s/t} dy. 

J Kn (1 — C) 



STABLE MARKED POINT PROCESSES 



27 



If S/t is a continuity point of W, then the above expression tends to ¥[W > 
S/t], by Theorem 6.3.1 of PRW [8]. Since W has no point mass at zero, we 
can make RK n ,B{t) arbitrarily close to 1 by choosing 5 sufficiently small. So, 

for all t, RK n ,B{t) — ► 1- Next, using the inequality 

fr*(B +B )( )<^M( B+y) (^^)} 

)<«+t} fe( B+ ,)(^)>t} 

we can establish that 

^A^l-c))- 1 / 1 , A Jfn , a ,„-, . < dy+(l-i?K n)B (*)). 

JK n (l-c) e Kn J iy )< x +t} 

Now, by the first convergence in (16), we may apply Theorem 6.3.1 of 
PRW [8] to 

/ fj-K n ,B,y ~/A 

T N(B+y) 1 

y '\ °K n ,B,y J 

and obtain, for any e > 0, LK n ,B{x) < J(x + i) + e with probability tending 
to one. At this point, let t tend to zero. Similar arguments produce the 
opposite inequality Lk„ : b(%) > J(x + 1) — e. Now letting e — > 0, we obtain 

L,K n ,B(x) — 51 «^(^), as desired. The proofs of (ii) and (iii) are similar to the 
proof of Theorem 2.2.1 in PRW [8]. □ 
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